Results 1 - 9 of 9
1.
Sci Rep ; 14(1): 1400, 2024 Jan 16.
Article in English | MEDLINE | ID: mdl-38228685

ABSTRACT

Mass spectra, which are agglomerations of ionized fragments from targeted molecules, play a crucial role across various fields for the identification of molecular structures. A prevalent analysis method involves spectral library searches, where unknown spectra are cross-referenced with a database. The effectiveness of such search-based approaches, however, is restricted by the scope of the existing mass spectra database, underscoring the need to expand the database via mass spectra prediction. In this research, we propose the Motif-based Mass Spectrum prediction Network (MoMS-Net), a GNN-based architecture that predicts mass spectra patterns using the structural motif information of the molecule. MoMS-Net represents both a molecule and its substructures as graphs, which facilitates the incorporation of long-range dependencies while using less memory than the graph transformer model. We evaluated our model on various types of mass spectra and demonstrated its validity and its superiority over conventional models.
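
The spectral library search mentioned in this abstract can be illustrated with a minimal sketch. It assumes spectra are stored as dictionaries mapping integer m/z bins to intensities and are scored by cosine similarity, which is a common choice but is not specified in the abstract. The function names and the toy library are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two spectra given as {mz_bin: intensity}."""
    dot = sum(a[mz] * b[mz] for mz in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def search_library(query, library):
    """Return (name, score) pairs sorted from best to worst match."""
    scores = [(name, cosine_similarity(query, spec)) for name, spec in library.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

# Toy library with made-up peaks (illustrative only, not real spectra).
library = {
    "compound_a": {43: 100.0, 58: 40.0, 71: 10.0},
    "compound_b": {29: 60.0, 45: 100.0, 46: 20.0},
}
query = {43: 90.0, 58: 35.0, 71: 12.0}
ranked = search_library(query, library)
```

A search like this can only return compounds that are already in the library, which is the limitation that motivates predicting spectra for molecules without measured entries.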

2.
ACS Omega ; 8(42): 39759-39769, 2023 Oct 24.
Article in English | MEDLINE | ID: mdl-37901490

ABSTRACT

In recent years, molecular representation learning has emerged as a key area of focus in various chemical tasks. However, many existing models fail to fully consider the geometric information on molecular structures, resulting in less intuitive representations. Moreover, the widely used message passing mechanism is limited in its ability to provide interpretations of experimental results from a chemical perspective. To address these challenges, we introduce a novel transformer-based framework for molecular representation learning, named the geometry-aware transformer (GeoT). The GeoT learns molecular graph structures through attention-based mechanisms specifically designed to offer reliable interpretability as well as molecular property prediction. Consequently, the GeoT can generate attention maps of the interatomic relationships associated with training objectives. In addition, the GeoT demonstrates performance comparable to that of MPNN-based models while achieving reduced computational complexity. Our comprehensive experiments, including an empirical simulation, reveal that the GeoT effectively learns chemical insights into molecular structures, bridging the gap between artificial intelligence and molecular sciences.

3.
Front Cardiovasc Med ; 10: 1167468, 2023.
Article in English | MEDLINE | ID: mdl-37416918

ABSTRACT

Background: Although coronary computed tomography angiography (CCTA) is currently utilized as the frontline test to accurately diagnose coronary artery disease (CAD) in clinical practice, there are still debates regarding its use as a screening tool for the asymptomatic population. Using deep learning (DL), we sought to develop a prediction model for significant coronary artery stenosis on CCTA and identify the individuals who would benefit from undergoing CCTA among apparently healthy asymptomatic adults. Methods: We retrospectively reviewed 11,180 individuals who underwent CCTA as part of routine health check-ups between 2012 and 2019. The main outcome was the presence of coronary artery stenosis of ≥70% on CCTA. We developed a prediction model using machine learning (ML), including DL. Its performance was compared with pretest probabilities, including the pooled cohort equation (PCE), CAD consortium, and updated Diamond-Forrester (UDF) scores. Results: In the cohort of 11,180 apparently healthy asymptomatic individuals (mean age 56.1 years; men 69.8%), 516 (4.6%) presented with significant coronary artery stenosis on CCTA. Among the ML methods employed, a neural network with multi-task learning (19 selected features), one of the DL methods, was selected due to its superior performance, with an area under the curve (AUC) of 0.782 and a high diagnostic accuracy of 71.6%. Our DL-based model demonstrated a better prediction than the PCE (AUC, 0.719), CAD consortium score (AUC, 0.696), and UDF score (AUC, 0.705). Age, sex, HbA1c, and HDL cholesterol were highly ranked features. Personal education and monthly income levels were also included as important features of the model. Conclusion: We successfully developed the neural network with multi-task learning for the detection of CCTA-derived stenosis of ≥70% in asymptomatic populations. Our findings suggest that this model may provide more precise indications for the use of CCTA as a screening tool to identify individuals at a higher risk, even in asymptomatic populations, in clinical practice.
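
For readers unfamiliar with the AUC values compared in this abstract, the following sketch computes the area under the ROC curve as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The labels and scores are invented for illustration and are not taken from the study; the function name is hypothetical.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. Runs in O(n_pos * n_neg) time,
    which is fine for small illustrative data.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))

# Hypothetical labels (1 = stenosis of 70% or more) and model scores.
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.8]
auc = roc_auc(labels, scores)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so the reported gap between 0.782 and 0.696 to 0.719 reflects a difference in how often cases are ranked above non-cases.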

4.
Nat Comput Sci ; 3(12): 1015-1022, 2023 Dec.
Article in English | MEDLINE | ID: mdl-38177719

ABSTRACT

Data-driven deep learning algorithms provide accurate prediction of high-level quantum-chemical molecular properties. However, their inputs must be constrained to the same quantum-chemical level of geometric relaxation as the training dataset, limiting their flexibility. Adopting alternative cost-effective conformation generative methods introduces domain-shift problems, deteriorating prediction accuracy. Here we propose a deep contrastive learning-based domain-adaptation method called Local Atomic environment Contrastive Learning (LACL). LACL learns to alleviate the disparities in distribution between the two geometric conformations by comparing different conformation-generation methods. We found that LACL forms a domain-agnostic latent space that encapsulates the semantics of an atom's local atomic environment. LACL achieves quantum-chemical accuracy while circumventing the geometric relaxation bottleneck and could enable future application scenarios such as inverse molecular engineering and large-scale screening. Our approach is also generalizable from small organic molecules to long chains of biological and pharmacological molecules.
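
The abstract does not state the loss function used by LACL, so the sketch below uses a generic InfoNCE-style contrastive loss as a stand-in. It assumes that embeddings of the same atom obtained from two conformation-generation methods form a positive pair and that all other atoms serve as negatives. The function names, the temperature, and the toy vectors are assumptions made for illustration.

```python
import math

def _normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss; anchors[i] and positives[i] form the positive pair."""
    a = [_normalize(v) for v in anchors]
    p = [_normalize(v) for v in positives]
    total = 0.0
    for i, ai in enumerate(a):
        # Cosine similarity of this anchor to every candidate, scaled by temperature.
        logits = [sum(x * y for x, y in zip(ai, pj)) / temperature for pj in p]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(z - m) for z in logits))
        total += log_denom - logits[i]
    return total / len(a)

# Toy per-atom embeddings from two hypothetical geometry sources.
relaxed = [[1.0, 0.0], [0.0, 1.0]]
cheap_aligned = [[0.9, 0.1], [0.1, 0.9]]
cheap_swapped = [[0.1, 0.9], [0.9, 0.1]]
loss_aligned = info_nce(relaxed, cheap_aligned)
loss_swapped = info_nce(relaxed, cheap_swapped)
```

The loss is small when matching atoms from the two geometry sources map to nearby embeddings and large when they do not, which is the kind of pressure that would pull the two domains toward a shared latent space.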


Subjects
Algorithms , Engineering , Molecular Conformation , Relaxation , Semantics
5.
Front Comput Neurosci ; 16: 1062678, 2022.
Article in English | MEDLINE | ID: mdl-36465966

ABSTRACT

Backpropagation has been regarded as the most favorable algorithm for training artificial neural networks. However, it has been criticized for its biological implausibility because its learning mechanism contradicts that of the human brain. Although backpropagation has achieved super-human performance in various machine learning applications, it often shows limited performance in specific tasks. We collectively referred to such tasks as machine-challenging tasks (MCTs) and aimed to investigate methods to enhance machine learning for MCTs. Specifically, we start with a natural question: Can a learning mechanism that mimics the human brain lead to the improvement of MCT performances? We hypothesized that a learning mechanism replicating the human brain is effective for tasks that are difficult for machine intelligence. Multiple experiments corresponding to specific types of MCTs where machine intelligence has room to improve performance were performed using predictive coding, a more biologically plausible learning algorithm than backpropagation. This study regarded incremental learning, long-tailed, and few-shot recognition as representative MCTs. With extensive experiments, we examined the effectiveness of predictive coding, which robustly outperformed backpropagation-trained networks for the MCTs. We demonstrated that predictive coding-based incremental learning alleviates the effect of catastrophic forgetting. Next, predictive coding-based learning mitigates the classification bias in long-tailed recognition. Finally, we verified that the network trained with predictive coding could correctly predict corresponding targets with few samples. We analyzed the experimental results by drawing analogies between the properties of predictive coding networks and those of the human brain and by discussing the potential of predictive coding networks in general machine learning.

6.
ACS Omega ; 7(5): 4234-4244, 2022 Feb 08.
Article in English | MEDLINE | ID: mdl-35155916

ABSTRACT

A molecule is a complex of heterogeneous components, and the spatial arrangements of these components determine the whole molecular properties and characteristics. With the advent of deep learning in computational chemistry, several studies have focused on how to predict molecular properties based on molecular configurations. A message-passing neural network provides an effective framework for capturing molecular geometric features from the perspective of a molecule as a graph. However, most of these studies assumed that all heterogeneous molecular features, such as atomic charge, bond length, or other geometric features, always contribute equivalently to the target prediction, regardless of the task type. In this study, we propose a dual-branched neural network for molecular property prediction based on both the message-passing framework and standard multilayer perceptron neural networks. Our model learns heterogeneous molecular features with different scales, which are trained flexibly according to each prediction target. In addition, we introduce a discrete branch to learn single-atom features without local aggregation, apart from message-passing steps. We verify that this novel structure can improve the model performance. The proposed model outperforms other recent models with sparser representations. Our experimental results indicate that, in the chemical property prediction tasks, the diverse chemical nature of targets should be carefully considered for both model performance and generalizability. Finally, we provide an intuitive analysis of the relationship between the experimental results and the chemical meaning of the target.

7.
Clin Cancer Res ; 26(24): 6513-6522, 2020 Dec 15.
Article in English | MEDLINE | ID: mdl-33028590

ABSTRACT

PURPOSE: Multigene assays provide useful prognostic information regarding hormone receptor (HR)-positive breast cancer. Next-generation sequencing (NGS)-based platforms have numerous advantages including reproducibility and adaptability in local laboratories. This study aimed to develop and validate an NGS-based multigene assay to predict the distant recurrence risk. EXPERIMENTAL DESIGN: In total, 179 genes including 30 reference genes highly correlated with the 21-gene recurrence score (RS) algorithm were selected from public databases. Targeted RNA-sequencing was performed using 250 and 93 archived breast cancer samples with a known RS in the training and verification sets, respectively, to develop the algorithm and NGS-Prognostic Score (NGS-PS). The assay was validated in 413 independent samples with long-term follow-up data on distant metastasis. RESULTS: In the verification set, the NGS-PS and 21-gene RS displayed 91.4% concurrence (85/93 samples). In the validation cohort of 413 samples, area under the receiver operating characteristic curve plotted using NGS-PS values classified for distant recurrence was 0.76. The best NGS-PS cut-off value predicting distant metastasis was 20. Furthermore, 269 and 144 patients were classified as low- and high-risk patients in accordance with the cut-off. Five- and 10-year estimates of distant metastasis-free survival (DMFS) for low- versus high-risk groups were 97.0% versus 77.8% and 93.2% versus 64.4%, respectively. The age-related HR for distant recurrence without chemotherapy was 9.73 (95% CI, 3.59-26.40) and 3.19 (95% CI, 1.40-7.29) for patients aged ≤50 and >50 years, respectively. CONCLUSIONS: The newly developed and validated NGS-based multigene assay can predict the distant recurrence risk in ER-positive, HER2-negative breast cancer.


Subjects
Antineoplastic Combined Chemotherapy Protocols/therapeutic use , Tumor Biomarkers/genetics , Breast Neoplasms/pathology , Local Neoplasm Recurrence/pathology , ErbB-2 Receptor/metabolism , Estrogen Receptors/metabolism , Adult , Aged , Aged 80 and over , Breast Neoplasms/drug therapy , Breast Neoplasms/genetics , Breast Neoplasms/metabolism , Female , Follow-Up Studies , Gene Expression Profiling , High-Throughput Nucleotide Sequencing , Humans , Middle Aged , Local Neoplasm Recurrence/drug therapy , Local Neoplasm Recurrence/genetics , Local Neoplasm Recurrence/metabolism , Prognosis , Prospective Studies , Retrospective Studies , Survival Rate
8.
Methods ; 179: 65-72, 2020 Jul 01.
Article in English | MEDLINE | ID: mdl-32445695

ABSTRACT

Drug metabolism is determined by the biochemical and physiological properties of the drug molecule. To improve the performance of a drug property prediction model, it is important to extract complex molecular dynamics from limited data. Recent machine learning and deep learning based models have employed atom- and bond-type information, as well as structural information, to predict drug properties. However, many of these methods can be used only for graph representations. The message passing neural network (MPNN) (Gilmer et al., 2017) is a framework used to learn both local and global features from irregularly formed data, and it is invariant to permutations. This network performs an iterative message passing (MP) operation on each object and its neighbors, and obtains the final output from all messages regardless of their order. In this study, we applied the MP-based attention network (Nikolentzos et al., 2019), originally developed for text learning, to perform chemical classification tasks. Before training, we tokenized the characters and obtained embeddings of each molecular sequence. We conducted various experiments to maximize the predictivity of the model. We trained and evaluated our model using various chemical classification benchmark tasks. Our results are comparable to or better than those of previous state-of-the-art and baseline models. To the best of our knowledge, this is the first attempt to learn chemical strings using an MP-based algorithm. We will extend our work to more complex tasks such as regression or generation tasks in the future.
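
The message passing operation and its order invariance described in this abstract can be sketched as follows. This is a simplified illustration that uses plain sum aggregation; real MPNNs use learned message and update functions, and the abstract does not describe the attention network's internals. The function names and the toy graph are hypothetical.

```python
def message_passing_step(features, neighbors):
    """One round of sum-aggregation message passing on a graph.

    features:  {node: [float, ...]} current node vectors
    neighbors: {node: [node, ...]} adjacency lists
    Each node's new vector is its own vector plus the sum of its neighbors'.
    """
    updated = {}
    for node, vec in features.items():
        agg = list(vec)
        for nb in neighbors[node]:
            agg = [a + b for a, b in zip(agg, features[nb])]
        updated[node] = agg
    return updated

def readout(features):
    """Order-independent graph-level output: element-wise sum over nodes."""
    dim = len(next(iter(features.values())))
    return [sum(vec[k] for vec in features.values()) for k in range(dim)]

# Toy three-node chain 0 - 1 - 2 with made-up two-dimensional features.
features = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [2.0, 2.0]}
neighbors = {0: [1], 1: [0, 2], 2: [1]}
out = readout(message_passing_step(features, neighbors))

# Listing the neighbors in another order leaves the result unchanged.
shuffled = {0: [1], 1: [2, 0], 2: [1]}
out_shuffled = readout(message_passing_step(features, shuffled))
```

Because both the aggregation and the readout are sums, the output does not depend on the order in which neighbors or nodes are visited, which is the permutation invariance the abstract refers to.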


Subjects
Cheminformatics/methods , Pharmaceutical Chemistry/methods , Deep Learning , Clinical Pharmacology/methods , Forecasting/methods , Humans
9.
BMC Bioinformatics ; 20(1): 521, 2019 Oct 26.
Article in English | MEDLINE | ID: mdl-31655545

ABSTRACT

BACKGROUND: Quantitative structure-activity relationship (QSAR) is a computational modeling method for revealing relationships between structural properties of chemical compounds and biological activities. QSAR modeling is essential for drug discovery, but it has many constraints. Ensemble-based machine learning approaches have been used to overcome constraints and obtain reliable predictions. Ensemble learning builds a set of diversified models and combines them. However, the most prevalent approach, random forest, and other ensemble approaches in QSAR prediction limit their model diversity to a single subject. RESULTS: The proposed ensemble method consistently outperformed thirteen individual models on 19 bioassay datasets and demonstrated superiority over other ensemble approaches that are limited to a single subject. The comprehensive ensemble method is publicly available at http://data.snu.ac.kr/QSAR/ . CONCLUSIONS: We propose a comprehensive ensemble method that builds multi-subject diversified models and combines them through second-level meta-learning. In addition, we propose an end-to-end neural network-based individual classifier that can automatically extract sequential features from a simplified molecular-input line-entry system (SMILES). The proposed individual classifier did not show impressive results as a single model, but it was considered the most important predictor when combined, according to the interpretation of the meta-learning.
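
The idea of combining individual models through second-level meta-learning can be sketched as a small stacking example. The abstract does not specify the meta-learner, so this sketch assumes a logistic regression fitted by batch gradient descent over the base models' predicted probabilities. The function names, hyperparameters, and data are made up for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_meta_learner(base_preds, labels, lr=0.5, epochs=2000):
    """Fit logistic-regression weights over base-model predictions.

    base_preds: list of rows, one probability per base model
    labels:     list of 0/1 targets
    Returns (weights, bias) learned by plain batch gradient descent.
    """
    n_models = len(base_preds[0])
    weights, bias = [0.0] * n_models, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_models, 0.0
        for row, y in zip(base_preds, labels):
            err = sigmoid(sum(w * x for w, x in zip(weights, row)) + bias) - y
            grad_w = [g + err * x for g, x in zip(grad_w, row)]
            grad_b += err
        weights = [w - lr * g / len(labels) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(labels)
    return weights, bias

def predict(row, weights, bias):
    """Combined probability for one row of base-model predictions."""
    return sigmoid(sum(w * x for w, x in zip(weights, row)) + bias)

# Made-up held-out predictions: model 0 is informative, model 1 is mostly noise.
base_preds = [[0.9, 0.5], [0.8, 0.4], [0.2, 0.5], [0.1, 0.6], [0.7, 0.5], [0.3, 0.4]]
labels = [1, 1, 0, 0, 1, 0]
weights, bias = fit_meta_learner(base_preds, labels)
```

In practice the rows fed to the meta-learner would come from held-out or cross-validated predictions so that it does not simply reward base models that overfit. The learned weights then give a rough indication of how much each base model contributes to the combined prediction.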


Subjects
Quantitative Structure-Activity Relationship , Drug Discovery/methods , Machine Learning